Shanghang Zhang

RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration

May 06, 2025

CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation

May 04, 2025

Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

May 03, 2025

ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance

Apr 23, 2025

EmbodiedOcc++: Boosting Embodied 3D Occupancy Prediction with Plane Regularization and Uncertainty Sampler

Apr 13, 2025

Segment Any Motion in Videos

Mar 28, 2025

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning

Mar 27, 2025

MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation

Mar 26, 2025

EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions?

Mar 19, 2025

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

Mar 13, 2025